Modeling Persistent Trends in Distributions
We present a nonparametric framework to model a short sequence of probability
distributions that vary both due to underlying effects of sequential
progression and confounding noise. To distinguish between these two types of
variation and estimate the sequential-progression effects, our approach
leverages an assumption that these effects follow a persistent trend. This work
is motivated by the recent rise of single-cell RNA-sequencing experiments over
a brief time course, which aim to identify genes relevant to the progression of
a particular biological process across diverse cell populations. While
classical statistical tools focus on scalar-response regression or
order-agnostic differences between distributions, it is desirable in this
setting to consider both the full distributions as well as the structure
imposed by their ordering. We introduce a new regression model for ordinal
covariates where responses are univariate distributions and the underlying
relationship reflects consistent changes in the distributions over increasing
levels of the covariate. This concept is formalized as a "trend" in
distributions, which we define as an evolution that is linear under the
Wasserstein metric. Implemented via a fast alternating projections algorithm,
our method exhibits numerous strengths in simulations and analyses of
single-cell gene expression data.
Comment: To appear in: Journal of the American Statistical Association
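To make the Wasserstein-based notion of a trend more concrete, here is a minimal Python sketch. It is not the paper's alternating projections algorithm. It relies only on the standard fact that, for univariate distributions, the 2-Wasserstein distance equals the L2 distance between quantile functions. The function names, the quantile grid, and the toy "shifted sample" sequence are all assumptions made for illustration.

```python
import numpy as np

def quantile_curve(samples, grid):
    """Empirical quantile function evaluated on a grid of probabilities."""
    return np.quantile(np.asarray(samples, dtype=float), grid)

def wasserstein2(samples_a, samples_b, n_grid=199):
    """Approximate 2-Wasserstein distance between two 1-D samples.

    In one dimension, W2 is the L2 distance between quantile functions;
    here that integral is approximated by a mean over a probability grid.
    """
    grid = np.linspace(0.005, 0.995, n_grid)
    qa = quantile_curve(samples_a, grid)
    qb = quantile_curve(samples_b, grid)
    return float(np.sqrt(np.mean((qa - qb) ** 2)))

# Hypothetical toy sequence: one sample shifted by a constant at each
# level of the ordinal covariate, so every quantile moves at the same rate.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
levels = [base + 0.5 * t for t in range(3)]

d01 = wasserstein2(levels[0], levels[1])
d12 = wasserstein2(levels[1], levels[2])
d02 = wasserstein2(levels[0], levels[2])

# For this shift-only sequence the distances add up along the ordering.
print(d01, d12, d02)
```

In this toy case the distance between the first and last level equals the sum of the two consecutive distances, which is one way to picture an evolution that proceeds in a consistent direction rather than fluctuating with noise.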
Estimating label quality and errors in semantic segmentation data via any model
The labor-intensive annotation process of semantic segmentation datasets is
often prone to errors, since humans struggle to label every pixel correctly. We
study algorithms to automatically detect such annotation errors, in particular
methods to score label quality, such that the images with the lowest scores are
least likely to be correctly labeled. This helps prioritize what data to review
in order to ensure a high-quality training/evaluation dataset, which is
critical in sensitive applications such as medical imaging and autonomous
vehicles. Widely applicable, our label quality scores rely on probabilistic
predictions from a trained segmentation model -- any model architecture and
training procedure can be utilized. Here we study 7 different label quality
scoring methods used in conjunction with a DeepLabV3+ or an FPN segmentation
model to detect annotation errors in a version of the SYNTHIA dataset.
Precision-recall evaluations reveal a score -- the soft-minimum of the
model-estimated likelihoods of each pixel's annotated class -- that is
particularly effective at identifying mislabeled images, across multiple
types of annotation error.
Comment: ICML Workshop on Data-centric Machine Learning Research 202
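The soft-minimum score described above can be sketched in Python as follows. This is one plausible formulation, assumed for illustration: each pixel's model-estimated likelihood of its annotated class is averaged with softmax weights that favor the smallest likelihoods. The function name, the array layout, and the `temperature` parameter are assumptions, not details taken from the paper.

```python
import numpy as np

def softmin_label_quality(pred_probs, labels, temperature=0.1):
    """Score one image's segmentation label; lower means more suspicious.

    pred_probs: array of shape (C, H, W) with per-pixel class probabilities.
    labels:     integer array of shape (H, W) with the annotated classes.
    temperature: hypothetical smoothing parameter; as it approaches 0 the
                 score approaches the plain minimum over pixels.
    """
    pred_probs = np.asarray(pred_probs, dtype=float)
    labels = np.asarray(labels)
    # Likelihood the model assigns to each pixel's annotated class.
    p = np.take_along_axis(pred_probs, labels[None, ...], axis=0)[0].ravel()
    # Softmin weights: softmax over the negated, temperature-scaled values.
    z = -p / temperature
    z -= z.max()  # numerical stability
    w = np.exp(z)
    w /= w.sum()
    return float(np.dot(w, p))

# Toy 2-class, 2x2 image where the model is confident in class 0 everywhere.
probs = np.array([[[0.9, 0.9], [0.9, 0.9]],
                  [[0.1, 0.1], [0.1, 0.1]]])
clean_labels = np.array([[0, 0], [0, 0]])
noisy_labels = np.array([[0, 0], [0, 1]])  # one pixel annotated as class 1

good_score = softmin_label_quality(probs, clean_labels)
bad_score = softmin_label_quality(probs, noisy_labels)
print(good_score, bad_score)
```

A single badly supported pixel pulls the soft-minimum far below the mean likelihood, which is why a score of this kind can flag images with localized annotation errors.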
ObjectLab: Automated Diagnosis of Mislabeled Images in Object Detection Data
Despite powering sensitive systems like autonomous vehicles, object detection
remains fairly brittle in part due to annotation errors that plague most
real-world training datasets. We propose ObjectLab, a straightforward algorithm
to detect diverse errors in object detection labels, including: overlooked
bounding boxes, badly located boxes, and incorrect class label assignments.
ObjectLab utilizes any trained object detection model to score the label
quality of each image, such that mislabeled images can be automatically
prioritized for label review/correction. Properly handling erroneous data
enables training a better version of the same object detection model, without
any change in existing modeling code. Across different object detection
datasets (including COCO) and different models (including Detectron-X101 and
Faster-RCNN), ObjectLab consistently detects annotation errors with much better
precision/recall compared to other label quality scores.
Comment: ICML Workshop on Data-centric Machine Learning Research
Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators
Real-world data for classification is often labeled by multiple annotators.
For analyzing such data, we introduce CROWDLAB, a straightforward approach to
estimate: (1) A consensus label for each example that aggregates the individual
annotations (more accurately than aggregation via majority-vote or other
algorithms used in crowdsourcing); (2) A confidence score for how likely each
consensus label is correct (via well-calibrated estimates that account for the
number of annotations for each example and their agreement,
prediction-confidence from a trained classifier, and trustworthiness of each
annotator vs. the classifier); (3) A rating for each annotator quantifying the
overall correctness of their labels. While many algorithms have been proposed
to estimate related quantities in crowdsourcing, these often rely on
sophisticated generative models with iterative inference schemes, whereas
CROWDLAB is based on simple weighted ensembling. Many algorithms also rely
solely on annotator statistics, ignoring the features of the examples from
which the annotations derive. CROWDLAB in contrast utilizes any classifier
model trained on these features, which can generalize between examples with
similar features. In evaluations on real-world multi-annotator image data, our
proposed method provides better estimates of (1)-(3) than many alternative
algorithms.
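The idea of weighted ensembling between a classifier and the annotators can be sketched in Python as below. This is a simplified illustration, not the CROWDLAB estimator itself: the weights here are supplied by hand, whereas the abstract describes weights that reflect the trustworthiness of each annotator versus the classifier. The function name, the use of -1 for missing annotations, and the one-hot treatment of votes are all assumptions.

```python
import numpy as np

def weighted_ensemble_consensus(pred_probs, annotations,
                                model_weight, annotator_weights):
    """Combine classifier probabilities with annotator votes.

    pred_probs:        (N, K) class probabilities from a trained classifier.
    annotations:       (N, M) integer labels, with -1 where annotator j
                       did not label example i (assumed convention).
    model_weight:      hypothetical weight for the classifier.
    annotator_weights: hypothetical length-M weights, one per annotator.
    Returns consensus labels and a confidence score for each.
    """
    pred_probs = np.asarray(pred_probs, dtype=float)
    annotations = np.asarray(annotations)
    weights = np.asarray(annotator_weights, dtype=float)
    n = pred_probs.shape[0]

    post = model_weight * pred_probs
    total = np.full(n, float(model_weight))
    for j in range(annotations.shape[1]):
        rows = np.nonzero(annotations[:, j] >= 0)[0]
        # Each available vote adds the annotator's weight to the chosen class.
        post[rows, annotations[rows, j]] += weights[j]
        total[rows] += weights[j]
    post /= total[:, None]

    consensus = post.argmax(axis=1)
    confidence = post[np.arange(n), consensus]
    return consensus, confidence

pred_probs = [[0.9, 0.1], [0.4, 0.6]]
annotations = [[0, 0, 1], [1, -1, -1]]  # second example has one annotation
consensus, confidence = weighted_ensemble_consensus(
    pred_probs, annotations, model_weight=1.0,
    annotator_weights=[1.0, 1.0, 1.0])
print(consensus, confidence)
```

Because the classifier's share of the ensemble shrinks as more annotators label an example, the resulting confidence depends on both the number of annotations and their agreement, in the spirit of item (2) above.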